Sensitivity Analysis:

The basis for adjoint model applications 


Adjoints in simple terms 


Adjoint Sensitivity Analysis for a Discrete Model 

The Problem to Consider: 


A possibly nonlinear model: 

$y = m(x)$

A differentiable scalar measure of model output fields:

$J = J(y)$

The result of input perturbations

$\Delta J = J(x + x') - J(x)$

A 1st-order Taylor series approximation to $\Delta J$

$J' = \sum_i \frac{\partial J}{\partial x_i} x'_i$

The goal is to efficiently determine $\partial J / \partial x_i$ for all $i$



Adjoint Sensitivity Analysis for a Discrete Model 

The Tangent Linear Model (TLM) 


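Apply a 1st-order Taylor series to approximate the model output

$y'_i = \sum_j \frac{\partial y_i}{\partial x_j} x'_j$

$\partial y_i / \partial x_j$ is called either the resolvent matrix of the TLM or the
Jacobian of the nonlinear model.

Approximate $\Delta J$ by a 1st-order Taylor series in $y'$

$J' = \sum_i \frac{\partial J}{\partial y_i} y'_i$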




A graphical TLM schematic

[Figure: schematic with two panels, NLM and TLM]



Adjoint Sensitivity Analysis for a Discrete Model 

The Adjoint Model 

(Adjoint of the TLM or adjoint of the nonlinear model) 


Application of the “chain rule” yields 


$\frac{\partial J}{\partial x_i} = \sum_j \frac{\partial y_j}{\partial x_i} \frac{\partial J}{\partial y_j}$


Contrast with the TLM

$y'_i = \sum_j \frac{\partial y_i}{\partial x_j} x'_j$





A. The variables are different in the two equations 

B. The order of applications of the variables related to x and y differ 

C. The indices i and j in the matrix operator are reversed 
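
The contrast between the TLM and the adjoint can be made concrete with a small numerical sketch. The toy model, its Jacobian, and the measure J below are hypothetical stand-ins chosen only for illustration (they are not from any atmospheric model). The sketch shows that a single application of the transposed Jacobian yields $\partial J / \partial x_i$ for all $i$, whereas a brute-force estimate needs one extra nonlinear run per input.

```python
import numpy as np

def model(x):
    """Hypothetical nonlinear model y = m(x): 3 inputs, 2 outputs."""
    return np.array([x[0] * x[1], np.sin(x[2]) + x[0] ** 2])

def jacobian(x):
    """Analytic Jacobian M[i, j] = dy_i/dx_j of the toy model."""
    return np.array([
        [x[1], x[0], 0.0],
        [2.0 * x[0], 0.0, np.cos(x[2])],
    ])

def measure(y):
    """Hypothetical scalar measure J(y): half the squared norm of y."""
    return 0.5 * np.dot(y, y)

x = np.array([1.0, 2.0, 0.5])
y = model(x)
M = jacobian(x)
dJ_dy = y                          # gradient of J with respect to y

# Adjoint: one transposed-Jacobian product gives dJ/dx_i for all i.
dJ_dx_adjoint = M.T @ dJ_dy

# Brute force: one perturbed nonlinear run per input component.
eps = 1e-6
dJ_dx_fd = np.array([
    (measure(model(x + eps * e)) - measure(y)) / eps
    for e in np.eye(x.size)
])

# TLM: propagates one chosen input perturbation forward.
x_prime = np.array([1e-3, -2e-3, 5e-4])
J_prime_tlm = dJ_dy @ (M @ x_prime)        # via y' = M x'
J_prime_adj = dJ_dx_adjoint @ x_prime      # via the adjoint-derived gradient
```

Both routes give the same J' for the chosen perturbation; the difference is that the adjoint gradient can be reused for any other x' without further model runs.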



Adjoint Sensitivity Analysis 
Impacts vs. Sensitivities 


A single impact study yields exact response measures 
(J) for all forecast aspects with respect to the particular 
perturbation investigated. 

$\Delta J = J(y + y') - J(y)$

A single adjoint-derived sensitivity yields linearized 
estimates of the particular measure (J) investigated 
with respect to all possible perturbations. 


Adjoint Sensitivity Analysis for a Discrete Model 

Additional Notes 


1. Mathematically, the field $\partial J / \partial x$ is said to reside in the dual space
of $x$.

2. With the change of notation $\hat{x} = \partial J / \partial x$, $M = \partial y / \partial x$, etc.,

$J' = \hat{y}^T y' = \hat{y}^T (M x') = (\hat{y}^T M) x' = (M^T \hat{y})^T x' = \hat{x}^T x'$    (11)


3. The exact definition of the adjoint depends on the quadratic
expression used to define $J'$. If the simple Euclidean norm (or dot
product) is used, then for a discrete model, the adjoint is simply
a transpose. Such a simple norm may not be appropriate when
the dual space fields are to be physically interpreted. (More on this
later.)

4. The adjoint is not generally the inverse: in non-trivial atmospheric
models, $M^T \neq M^{-1}$.

5. This is all 1st-year calculus and linear algebra. If examination of
gradients is useful, then so are the adjoint models used to calculate
them.
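
Notes 3 and 4 can be checked numerically. The sketch below is an illustration under stated assumptions: M is a random matrix standing in for a TLM operator, and W_x and W_y are hypothetical diagonal weight matrices defining weighted inner products on the input and output spaces. With the Euclidean dot product the adjoint is the transpose; with weighted inner products it becomes $W_x^{-1} M^T W_y$; in neither case is it the inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.normal(size=(n, n))                # stand-in TLM operator (hypothetical)
W_x = np.diag(rng.uniform(0.5, 2.0, n))    # weights of the input inner product
W_y = np.diag(rng.uniform(0.5, 2.0, n))    # weights of the output inner product

M_adj_euclid = M.T                                   # adjoint for the dot product
M_adj_weighted = np.linalg.solve(W_x, M.T @ W_y)     # W_x^{-1} M^T W_y

x_p = rng.normal(size=n)      # arbitrary input perturbation x'
y_hat = rng.normal(size=n)    # arbitrary output-space (dual) vector

# Defining property of an adjoint: <y_hat, M x'> = <M* y_hat, x'>
lhs_e = y_hat @ (M @ x_p)
rhs_e = (M_adj_euclid @ y_hat) @ x_p
lhs_w = y_hat @ W_y @ (M @ x_p)
rhs_w = (M_adj_weighted @ y_hat) @ W_x @ x_p

# The transpose equals the inverse only for orthogonal matrices.
adjoint_is_inverse = np.allclose(M.T @ M, np.eye(n))
```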



Adjoint Sensitivity Analysis for a Discrete Model 

Example Equations 


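Nonlinear model:

$\frac{\partial u}{\partial t} = -u \frac{\partial u}{\partial x}$

Discrete NLM (superscript: time index; subscript: space index)

$u_i^{n+1} = u_i^n - \frac{\Delta t}{2 \Delta x} u_i^n \left( u_{i+1}^n - u_{i-1}^n \right)$

TLM (linearized about the time- and space-varying solution $u$)

$u_i'^{\,n+1} = u_i'^{\,n} - \frac{\Delta t}{2 \Delta x} \left[ u_i'^{\,n} \left( u_{i+1}^n - u_{i-1}^n \right) + u_i^n \left( u_{i+1}'^{\,n} - u_{i-1}'^{\,n} \right) \right]$

Adjoint model:

$\hat{u}_i^n = \hat{u}_i^{n+1} - \frac{\Delta t}{2 \Delta x} \left[ \left( u_{i+1}^n - u_{i-1}^n \right) \hat{u}_i^{n+1} + u_{i-1}^n \hat{u}_{i-1}^{n+1} - u_{i+1}^n \hat{u}_{i+1}^{n+1} \right]$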
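
A minimal sketch of an NLM, TLM, and adjoint triple for a nonlinear advection equation follows. It assumes a forward-in-time, centered-in-space discretization of $\partial u / \partial t = -u \, \partial u / \partial x$ with periodic boundaries; the function names, grid size, and the constant c = Δt/(2Δx) are choices made for this illustration. The adjoint runs backward in time and needs the stored nonlinear trajectory.

```python
import numpy as np

def nlm_step(u, c):
    """One nonlinear step; c = dt / (2 dx), periodic boundaries."""
    return u - c * u * (np.roll(u, -1) - np.roll(u, 1))

def tlm_step(du, u, c):
    """Linearization of nlm_step about the reference state u."""
    return du - c * (du * (np.roll(u, -1) - np.roll(u, 1))
                     + u * (np.roll(du, -1) - np.roll(du, 1)))

def adj_step(ua, u, c):
    """Transpose of tlm_step: maps sensitivities at step n+1 back to step n."""
    w = u * ua
    return ua - c * ((np.roll(u, -1) - np.roll(u, 1)) * ua
                     + np.roll(w, 1) - np.roll(w, -1))

def run_nlm(u0, c, nsteps):
    """Integrate the NLM and keep the whole trajectory."""
    traj = [u0]
    for _ in range(nsteps):
        traj.append(nlm_step(traj[-1], c))
    return traj

def run_tlm(du0, traj, c):
    """Integrate the TLM forward along the stored trajectory."""
    du = du0
    for u in traj[:-1]:
        du = tlm_step(du, u, c)
    return du

def run_adj(ua_final, traj, c):
    """Integrate the adjoint backward along the stored trajectory."""
    ua = ua_final
    for u in reversed(traj[:-1]):
        ua = adj_step(ua, u, c)
    return ua

rng = np.random.default_rng(1)
nx, nsteps, c = 32, 20, 0.02
u0 = 1.0 + 0.2 * np.sin(2 * np.pi * np.arange(nx) / nx)
traj = run_nlm(u0, c, nsteps)

# Dot-product test of the adjoint against the TLM.
du0 = rng.normal(size=nx)
ua_final = rng.normal(size=nx)
lhs = ua_final @ run_tlm(du0, traj, c)
rhs = run_adj(ua_final, traj, c) @ du0
```

The two inner products lhs and rhs should agree to round-off; any larger discrepancy indicates that the adjoint code is not the exact transpose of the TLM code.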



Warning 


Although the previous description of an adjoint for a 
discrete model is correct, it fails to adequately account 
for some issues regarding the discrete representation of 
physically continuous fields. 

As long as the interpretations of sensitivity concern the 
given model and resolution or the applications of gradients 
concern some classes of optimization problems, this 
“failure” does not apply. 


Examples of Adjoint-Derived Sensitivities 


Example Sensitivity Field

[Figure: $\partial J / \partial z$ for t = -36, $\sigma$ = 0.40. Contour interval 0.02 Pa/m, M = 0.1 Pa/m. Lewis et al. 2001]

[Figure: Sensitivity field for $J = p_s$ with respect to $T$ for an idealized cyclone. Vertical axis in hPa (500 to 1000); horizontal scale 1000 km. From Langland and Errico 1996 MWR]

Development of Adjoint Model Software 


First consider deriving the TLM and its adjoint model codes
directly from the NLM code


Why consider development from code? 


1. Eventually, a TLM and adjoint code will be necessary. 

2. The code itself is the most accurate description of the model algorithm. 

3. If the model algorithm creates different dynamics than the original equations 
being modeled, for most applications it is the former that are desirable and 
only the former that can be validated. 


Development of Adjoint Model From 
Line by Line Analysis of Computer Code 

Automatic Differentiation 


TAMC        Ralf Giering (superseded by TAF)
TAF         FastOpt.com
ADIFOR      Rice University
TAPENADE    INRIA, Nice
OPENAD      Argonne
Others      www.autodiff.org


Development of Adjoint Model From 
Line by Line Analysis of Computer Code 

1. TLM and Adjoint models are straightforward (although tedious)
to derive from NLM code, and actually simpler to develop than the parent NLM.

2. Intelligent approximations can be made to improve efficiency. 

3. TLM and (especially) Adjoint codes are simple to test rigorously. 

4. Some outstanding errors and problems in the NLM are typically 
revealed when the TLM and Adjoint are developed from it. 

5. Some approximations to the NLM physics considered are 
generally necessary. 

6. It is best to start from clean NLM code. 

7. The TLM and Adjoint can be formally correct but useless! 
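
Line-by-line derivation can be illustrated with a two-statement routine. The sketch below is a hypothetical example, not taken from any model: each NLM statement is differentiated in order to give the TLM, and the adjoint applies the transposes of those TLM statements in reverse order. The function names and the `b` suffix for adjoint variables are naming assumptions made here.

```python
import math

def nlm(x, y):
    """Hypothetical two-statement nonlinear routine."""
    z = x * y
    w = math.sin(z) + x
    return w

def tlm(x, y, dx, dy):
    """Tangent linear code: differentiate each NLM statement in order."""
    z = x * y                      # reference-state value needed below
    dz = dx * y + x * dy           # from z = x * y
    dw = math.cos(z) * dz + dx     # from w = sin(z) + x
    return dw

def adj(x, y, wb):
    """Adjoint code: transpose each TLM statement, in reverse order."""
    z = x * y                      # recompute the reference state
    xb = yb = zb = 0.0
    # adjoint of: dw = cos(z) * dz + dx
    zb += math.cos(z) * wb
    xb += wb
    # adjoint of: dz = dx * y + x * dy
    xb += y * zb
    yb += x * zb
    return xb, yb

# Dot-product test: <wb, TLM(dx, dy)> must equal <ADJ(wb), (dx, dy)>.
x, y, dx, dy, wb = 0.7, -1.3, 0.2, 0.5, 1.9
lhs = wb * tlm(x, y, dx, dy)
xb, yb = adj(x, y, wb)
rhs = xb * dx + yb * dy
```

The dot-product identity holds to round-off for any inputs, which is why item 3 calls adjoint codes simple to test rigorously; it says nothing, however, about whether the linearization is useful (item 7).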


Nonlinear Validation 


Does the TLM or Adjoint model tell us anything about 
the behavior of meaningful perturbations in the nonlinear 
model that may be of interest? 


Linear vs. Nonlinear Results in Moist Model 


[Figure: 24-hour SV1 from case W1, initialized with T' = 1 K; final $p_s$ field shown. Panels: linear, nonlinear. Contour interval 0.5 hPa. Errico and Raeder 1999 QJRMS]







Linear vs. Nonlinear Results 

In general, agreement between TLM and NLM results 
will depend on: 

1. Amplitude of perturbations

2. Stability properties of the reference state

3. Structure of perturbations

4. Physics involved

5. Time period over which perturbation evolves

6. Measure of agreement


The agreement of the TLM and NLM is exactly 
that of the Adjoint and NLM if the Adjoint is exact 
with respect to the TLM. 
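
The dependence on perturbation amplitude (item 1 above) can be seen in a toy setting. The sketch below uses the logistic map as a hypothetical stand-in for a nonlinear model; the parameter r, the initial state, the number of steps, and the amplitudes are arbitrary choices for illustration. It compares the nonlinear difference with the TLM prediction and reports their relative disagreement.

```python
import numpy as np

R = 3.7          # logistic-map parameter (arbitrary, chaotic regime)
NSTEPS = 5       # length of the "forecast"

def nlm(x0):
    """Iterate the logistic map x -> R x (1 - x) for NSTEPS steps."""
    x = x0
    for _ in range(NSTEPS):
        x = R * x * (1.0 - x)
    return x

def tlm(x0, dx0):
    """Tangent linear model along the trajectory started from x0."""
    x, dx = x0, dx0
    for _ in range(NSTEPS):
        dx = R * (1.0 - 2.0 * x) * dx   # derivative of the map at the reference state
        x = R * x * (1.0 - x)
    return dx

x0 = 0.3
amplitudes = [1e-6, 1e-4, 1e-2, 1e-1]
rel_errors = []
for a in amplitudes:
    nonlinear_diff = nlm(x0 + a) - nlm(x0)
    linear_diff = tlm(x0, a)
    rel_errors.append(abs(nonlinear_diff - linear_diff) / abs(linear_diff))

for a, e in zip(amplitudes, rel_errors):
    print(f"amplitude {a:8.1e}   relative TLM-NLM disagreement {e:8.2e}")
```

For the smallest amplitude the TLM reproduces the nonlinear difference closely, while for the largest it does not; where the useful range ends in a real model depends on the other items in the list as well.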


Problems with Physics 


1. The model may be non-differentiable.

2. Unrealistic discontinuities should be smoothed after
reconsideration of the physics being parameterized.

3. Perhaps worse than discontinuities are numerical instabilities
that can be created from physics linearization.

4. It is possible to test the suitability of physics components
for adjoint development before constructing the adjoint.

5. Development of an adjoint provides a fresh and
complementary look at parameterization schemes.
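
Items 1 and 2 can be illustrated with a deliberately crude on/off parameterization. Everything in the sketch below is hypothetical (the threshold q_sat, the amplitude H, and the tanh smoothing are choices made for this example, not a scheme from any model). It shows that the linearization of a switch misses a finite nonlinear response, and that a smoothing with too narrow a transition width replaces the discontinuity with a very large derivative.

```python
import numpy as np

H, Q_SAT = 1.0, 1.0   # hypothetical response amplitude and threshold

def switch(q):
    """Discontinuous parameterization: response is on only above threshold."""
    return np.where(q > Q_SAT, H, 0.0)

def switch_tlm(q, dq):
    """Linearization of the switch: its derivative is zero wherever defined."""
    return np.zeros_like(q) * dq

def smooth(q, width):
    """Smoothed switch using a tanh transition of the given width."""
    return 0.5 * H * (1.0 + np.tanh((q - Q_SAT) / width))

def smooth_tlm(q, dq, width):
    """Linearization of the smoothed switch."""
    return 0.5 * H / width / np.cosh((q - Q_SAT) / width) ** 2 * dq

q = np.array([0.99])
dq = np.array([0.02])

nl_change = switch(q + dq) - switch(q)   # perturbation crosses the threshold
tl_change = switch_tlm(q, dq)            # TLM predicts no response at all

# Linear response at the threshold for a narrow and a wide transition.
at_threshold = np.array([Q_SAT])
narrow = smooth_tlm(at_threshold, dq, width=1e-3)
wide = smooth_tlm(at_threshold, dq, width=0.1)
```

With the narrow width the linear response to the same perturbation exceeds the largest change the scheme can actually produce (H), so the choice of smoothing scale needs the same physical scrutiny as the original switch.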


Efficient solution of optimization problems 


Optimal Perturbations 

Type I 


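Maximize $J' = \sum_i \frac{\partial J}{\partial x_i} x'_i$

Given the constraint: $C = \frac{1}{2} \sum_i w_i x_i'^2$

Solution Method: Minimize the augmented variable formed from $J'$ and the constraint with a Lagrange multiplier $\lambda$.

Solution:

$x'_i(\mathrm{optimal}) = \frac{\lambda}{w_i} \frac{\partial J}{\partial x_i}$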
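
A small numerical sketch of the Type I problem follows. The gradient vector, weights, and the value of C are random or arbitrary stand-ins for illustration. The multiplier is obtained by substituting $x'_i = (\lambda / w_i) \, \partial J / \partial x_i$ into the constraint $C = \frac{1}{2} \sum_i w_i x_i'^2$, which gives $\lambda = [2C / \sum_i w_i^{-1} (\partial J / \partial x_i)^2]^{1/2}$; that step is a derivation made here, not quoted from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
grad = rng.normal(size=n)            # stand-in for dJ/dx_i from an adjoint run
w = rng.uniform(0.5, 2.0, size=n)    # positive weights in the constraint
C = 0.5                              # prescribed size: C = 1/2 sum_i w_i x'_i^2

# Multiplier fixed by the constraint, then the optimal perturbation.
lam = np.sqrt(2.0 * C / np.sum(grad ** 2 / w))
x_opt = lam * grad / w
J_opt = grad @ x_opt

def random_feasible():
    """Random perturbation rescaled to satisfy the same constraint."""
    x = rng.normal(size=n)
    return x * np.sqrt(2.0 * C / np.sum(w * x ** 2))

# No feasible perturbation should produce a larger J' than x_opt.
J_random_best = max(grad @ random_feasible() for _ in range(1000))
```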




Optimal Perturbations 

Type II 


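Minimize $C = \frac{1}{2} \sum_i w_i x_i'^2$

Given the constraint: $J' = \sum_i \frac{\partial J}{\partial x_i} x'_i$

Solution Method (as before)

Solution:

$x'_i(\mathrm{optimal}) = \frac{\lambda}{w_i} \frac{\partial J}{\partial x_i}$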




Optimal Perturbations 

Singular Vectors 


Maximize the L2 norm: $N = \frac{1}{2} y'^T N y'$

Given the TLM: $y' = M x'$

And the constraint: $1 = C = \frac{1}{2} x'^T C x'$

Solution Method: Minimize the augmented variable $I(x')$

$I = \frac{1}{2} x'^T M^T N M x' + \lambda^2 \left( 1 - \frac{1}{2} x'^T C x' \right)$

$\frac{\partial I}{\partial x'} = M^T N M x' - \lambda^2 C x'$

For $z = C^{1/2} x'$, the solution is an eigenvalue problem

$\lambda^2 z = C^{-1/2} M^T N M C^{-1/2} z$



Optimal Perturbations 

Additional Notes Regarding SVs 


1. $\lambda$ are the singular values of the matrix $N^{1/2} M C^{-1/2}$.

2. The set of $x'$ form an orthonormal basis with respect to the norm
$C$.

3. If $C$ and $N$ are the Euclidean norm $I$, then $x' = z$ are the right
(or initial) singular vectors (or SVs) of $M$ and $y' = M x'$ are
the left (or final or evolved) singular vectors of $M$. The same
terminology is used even for more general norms.

4. $\lambda^2 = N/C$ for each solution.

5. If $C$ is the inverse of the error covariance matrix, then the evolved
SVs are the EOFs (or PCs) of the forecast error covariance, and
truncations using the leading SVs maximize the retained error
variance. (Ehrendorfer and Tribbia 1997 JAS)

6. The SVs and $\lambda^2$ depend on the norms used; i.e., on how measurements
are made. This dependency is removed only by introducing
some other constraint or condition.

7. SVs produced for semi-infinite periods are equivalent to Lyapunov
vectors (Legras and Vautard, 1995 ECMWF Note).
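
Notes 1, 2, and 4 can be verified on a small example. In the sketch below, M is a random matrix standing in for a TLM propagator and the norms C and N are hypothetical diagonal, positive matrices; all are assumptions for illustration. The eigenproblem for $z = C^{1/2} x'$ is solved with `numpy.linalg.eigh` and compared with the singular values of $N^{1/2} M C^{-1/2}$ from `numpy.linalg.svd`.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
M = rng.normal(size=(n, n))            # stand-in TLM propagator
c = rng.uniform(0.5, 2.0, size=n)      # diagonal of the initial norm C
nn = rng.uniform(0.5, 2.0, size=n)     # diagonal of the final norm N

C_mhalf = np.diag(c ** -0.5)           # C^{-1/2}
N_half = np.diag(nn ** 0.5)            # N^{1/2}
A = N_half @ M @ C_mhalf

# Eigenproblem: lambda^2 z = C^{-1/2} M^T N M C^{-1/2} z (eigh sorts ascending).
lam2, Z = np.linalg.eigh(A.T @ A)
sing_vals = np.linalg.svd(A, compute_uv=False)   # sorted descending

# Leading initial SV, scaled so that 1/2 x'^T C x' = 1, and its evolved SV.
z = Z[:, -1]
x_p = np.sqrt(2.0) * C_mhalf @ z
y_p = M @ x_p
C_norm = 0.5 * x_p @ (c * x_p)
N_norm = 0.5 * y_p @ (nn * y_p)

# Orthonormality of the full set of x' with respect to C (note 2).
X = C_mhalf @ Z
gram = X.T @ np.diag(c) @ X
```

Changing either norm changes both the singular values and the vectors, which is the dependence described in note 6.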



The more general nonlinear optimization problem 


Find the local minima of a scalar nonlinear function J(x). 


[Figure: contours of $J$ and the gradient $\partial J / \partial x$]



The Energy Norm 




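$E = \frac{1}{2} \int \int \left[ u'^2 + v'^2 + \frac{c_p}{T_r} T'^2 + R T_r \left( \frac{p_s'}{p_{sr}} \right)^2 \right] dA \, d\sigma$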


Errico, R.M., 2000: Interpretations of the total energy and rotational energy 
norms applied to determination of singular vectors. Quart. J. Roy. Meteor. 
Soc., 126A, 1581-1599. 



Problems with Physics 


Tangent linear vs. nonlinear model solutions 



Errico and Raeder 1999 QJRMS


Problems with Physics 

Parameterization of Vertical Eddy Diffusion 


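NLM:

$\frac{\partial u}{\partial t} = \cdots + \frac{1}{\rho} \frac{\partial}{\partial z} \left( \rho K \frac{\partial u}{\partial z} \right)$

The $K$ are flow-dependent eddy diffusion coefficients.

TLM:

$\frac{\partial u'}{\partial t} = \cdots + \frac{1}{\rho} \frac{\partial}{\partial z} \left( \rho K \frac{\partial u'}{\partial z} \right) + \frac{1}{\rho} \frac{\partial}{\partial z} \left( \rho K' \frac{\partial u}{\partial z} \right)$ + terms for $\rho'$

Usually a semi-implicit treatment of $\partial u / \partial z$ is used to greatly
increase numerical stability. This appears to work in the NLM but
is insufficient in the TLM.

Instead, the $K'$ term is generally ignored!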



Problems with Physics 

Consider Parameterization of Stratiform Precipitation 



Example of a potentially worse problem introduced by smoothing 







Other Important Considerations 


Physically-based norms and the interpretations of 
sensitivity fields 


[Figure: $\partial$(error "energy") / $\partial$($T_v$ 24 hours earlier); panels: 1 x 1.25 degree lat-lon and 0.5 x 0.0625 degree lat-lon]
Sensitivities of continuous fields


Consider $J(\mathbf{f}(\mathbf{x}))$ where $J$ is a scalar function of a set of continuous fields $f_i$
represented by the vector $\mathbf{f}$, each defined within a multidimensional space $\mathbf{x}$.
Then, the real functional expression

$J' = \sum_i \frac{\partial J}{\partial f_i} \delta f_i$

should be interpreted as

$J' = \sum_i \int dS(\mathbf{x}) \, \frac{\partial J}{\partial f_i}(\mathbf{x}) \, \delta f_i(\mathbf{x})$

where $S$ is a volume, mass, or other metric. With this interpretation, $\partial J / \partial f_i$
has physical units of $J \times f_i^{-1} \times S^{-1}$; i.e., it is a kind of sensitivity density.

This field of sensitivity density is relatively independent of the grid on which
it is represented, but to estimate the change of $J$ due to a perturbation $\delta f_i$
applied at grid point $\mathbf{x}_g$, the grid volume $dS$ at this point must be considered:

$J' \approx \frac{\partial J}{\partial f_i}(\mathbf{x}_g) \, \delta f_i(\mathbf{x}_g) \, dS(\mathbf{x}_g)$

It is safer to base physical interpretations of sensitivity on its density, but
then sensitivities to grid point perturbations become less obvious.
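
The distinction between grid-point sensitivity and sensitivity density can be shown in one dimension. The sketch below assumes a hypothetical measure $J = \int g(x) f(x) \, dx$ on [0, 1] with a Gaussian weighting kernel g, discretized on equal cells; the kernel and grid sizes are arbitrary choices for illustration. The gradient with respect to each grid-point value shrinks as the grid is refined, while the gradient divided by the cell size does not.

```python
import numpy as np

def grid_sensitivity(nx):
    """Sensitivities of J = sum_k g_k f_k dx on a grid of nx equal cells.

    Returns cell centers, dJ/df_k (per grid point), and the density
    (dJ/df_k divided by the cell size dx).
    """
    dx = 1.0 / nx
    x = (np.arange(nx) + 0.5) * dx
    g = np.exp(-((x - 0.5) / 0.1) ** 2)   # hypothetical weighting kernel
    dJ_df = g * dx                        # what a discrete adjoint returns
    density = dJ_df / dx                  # sensitivity per unit length
    return x, dJ_df, density

x_coarse, grad_coarse, dens_coarse = grid_sensitivity(20)
x_fine, grad_fine, dens_fine = grid_sensitivity(80)

print("max dJ/df_k  coarse, fine:", grad_coarse.max(), grad_fine.max())
print("max density  coarse, fine:", dens_coarse.max(), dens_fine.max())
```

To recover the change in J from a perturbation at one grid point, the density must be multiplied back by the local cell size, which is the extra factor in the grid-point estimate above.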


Sensitivity of J with respect to u 5 days earlier at 45°N, 
where J is the zonal mean of zonal wind within a narrow 
band centered on 10 hPa and 60°N. (From E. Novakovskaia) 
Rescaling options for a vertical grid

[Figure: 2 re-scalings of the adjoint results; panels: mass weighting, volume weighting. From E. Novakovskaia]


Summary 


Misunderstanding #1 


False: Adjoint models are difficult to understand. 

True: Understanding of adjoints of numerical models 
primarily requires concepts taught in early 
college mathematics. 


Misunderstanding #2 


False: Adjoint models are difficult to develop. 

True: Adjoint models of dynamical cores are simpler 
to develop than their parent models, and almost 
trivial to check, but adjoints of model physics 
can pose difficult problems. 


Misunderstanding #3 


False: Automatic adjoint generators easily generate 
perfect and useful adjoint models. 

True: Problems can be encountered with automatically 
generated adjoint codes that are inherent in the 
parent model. Do these problems also have a 
bad effect in the parent model? 


Misunderstanding #4 


False: An adjoint model is demonstrated useful and 
correct if it reproduces nonlinear results for 
ranges of very small perturbations. 

True: To be truly useful, adjoint results must yield 
good approximations to sensitivities with 
respect to meaningfully large perturbations. 
This must be part of the validation process. 


Misunderstanding #5 


False: Adjoints are not needed because the EnKF is
better than 4DVAR and adjoint results disagree
with our notions of atmospheric behavior.

True: Adjoint models are more useful than just for
4DVAR. Their results are sometimes profound,
but usually confirmable, thereby requiring new
theories of atmospheric behavior. It is rare that we
have a tool that can answer such important questions
so directly!


What is happening and where are we headed? 


1. There are several adjoint models now, with varying
portions of physics and validation.

2. Utilization and development of adjoint models have been
slow to expand, for a variety of reasons.

3. Adjoint models are powerful tools that are under-utilized.

4. Adjoint models are like gold veins waiting to be mined.

5. Validity of some effects remains questionable.